-
Despite the recent surge of viral metagenomic studies, it remains a significant challenge to recover complete virus genomes from metagenomic data. The majority of viral contigs generated by de novo assembly programs are highly fragmented, presenting significant challenges to downstream analysis and inference. To address this issue, we have developed Virseqimprover, a computational pipeline that can extend assembled contigs to complete or nearly complete genomes while maintaining extension quality. Virseqimprover first examines whether the sequence is chimeric based on read coverage, breaks the sequence into segments if it is, then extends the longest segment with uniform depth of coverage, and repeats these procedures until the sequence cannot be extended further. Finally, Virseqimprover annotates the gene content of the resulting sequence. Results show that Virseqimprover performs well at correcting and extending viral contigs to their full lengths, and can therefore be a useful tool for improving the completeness and minimizing the assembly errors of viral contigs. Both a web server and a conda package for Virseqimprover are provided to the research community free of charge.
Free, publicly accessible full text available January 1, 2026
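The chimera check and segment selection described above can be sketched as follows. This is a minimal illustration, not Virseqimprover's actual implementation: the function names, the window size, and the depth-ratio threshold are assumptions chosen for the example, and per-base read depth is assumed to be available as a list of integers.

```python
from statistics import median


def split_on_coverage_shift(depth, window=100, max_ratio=3.0):
    """Split a contig into segments of roughly uniform read depth.

    depth     -- per-base read depth for the contig (hypothetical input)
    window    -- bases summarized on each side of a candidate breakpoint
    max_ratio -- assumed threshold: cut when adjacent window medians
                 differ by more than this factor
    Returns a list of half-open (start, end) segments.
    """
    if not depth:
        return []
    breakpoints = [0]
    for pos in range(window, len(depth) - window + 1, window):
        left = median(depth[pos - window:pos])
        right = median(depth[pos:pos + window])
        low, high = sorted((left, right))
        # max(low, 1) avoids a zero threshold in uncovered regions.
        if high > max_ratio * max(low, 1):
            breakpoints.append(pos)
    breakpoints.append(len(depth))
    return list(zip(breakpoints, breakpoints[1:]))


def longest_segment(segments):
    """Pick the longest segment, the one that would be extended next."""
    return max(segments, key=lambda seg: seg[1] - seg[0])


# Toy contig: 300 bases at 10x joined to 200 bases at 100x.
toy_depth = [10] * 300 + [100] * 200
toy_segments = split_on_coverage_shift(toy_depth)
toy_longest = longest_segment(toy_segments)
```

In a full pipeline, the selected segment would be extended with reads and the check repeated until no further extension is possible; that loop is omitted here because it depends on the assembler and read mapper used.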
-
Abstract Viruses of the phylum Nucleocytoviricota, often referred to as “giant viruses,” are prevalent in various environments around the globe and play significant roles in shaping eukaryotic diversity and activities in global ecosystems. Given the extensive phylogenetic diversity within this viral group and the highly complex composition of their genomes, taxonomic classification of giant viruses, particularly incomplete metagenome-assembled genomes (MAGs), can present a considerable challenge. Here we developed TIGTOG (Taxonomic Information of Giant viruses using Trademark Orthologous Groups), a machine learning-based approach to predict the taxonomic classification of novel giant virus MAGs based on profiles of protein family content. We applied a random forest algorithm to a training set of 1531 quality-checked, phylogenetically diverse Nucleocytoviricota genomes using pre-selected sets of giant virus orthologous groups (GVOGs). The classification models were predictive of viral taxonomic assignments with a cross-validation accuracy of 99.6% at the order level and 97.3% at the family level. We found that no individual GVOGs or genome features significantly influenced the algorithm’s performance or the models’ predictions, indicating that classification predictions were based on a comprehensive genomic signature, which reduced the necessity of a fixed set of marker genes for taxonomic assignment. Our classification models were validated with an independent test set of 823 giant virus genomes with varied genomic completeness and taxonomy and demonstrated an accuracy of 98.6% and 95.9% at the order and family level, respectively. Our results indicate that protein family profiles can be used to accurately classify large DNA viruses at different taxonomic levels and provide a fast and accurate method for the classification of giant viruses. This approach could easily be adapted to other viral groups.
-
Abstract Phages (viruses of bacteria and archaea) exert ubiquitous top-down control on microbial communities by selectively infecting and killing cells. As obligate parasites, phages are inherently linked to processes that impact their hosts’ distribution and physiology, but phages can also be impacted by external, environmental factors, such as UV radiation degrading their virions. To better understand these complex links of phages to their hosts and the environment, we leverage the unique ecological context of the Isthmus of Panama, which narrowly disconnects the productive Tropical Eastern Pacific (EP) and nutrient-poor Tropical Western Atlantic (WA) provinces. We could thus compare patterns of phage and prokaryotic communities at both global scales (between oceans) and local scales (between habitats within an ocean). Although both phage and prokaryotic communities differed sharply between the oceans, phage community composition did not significantly differ between mangroves and reefs of the WA, while prokaryotic communities were distinct. These results suggest phages are more shaped by dispersal processes than local conditions regardless of spatial scale, while prokaryotes tend to be shaped by local conditions at smaller spatial scales. Collectively, we provide a framework for addressing the co-variability between phages and prokaryotes in marine systems and identifying factors that drive consistent versus disparate trends in community shifts, essential to informing models of biogeochemical cycles that include these interactions.
-
Since the discovery of the first “giant virus,” particular attention has been paid toward isolating and culturing these large DNA viruses through Acanthamoeba spp. bait systems. While this method has allowed for the discovery of many novel viruses in the Nucleocytoviricota, environmental -omics-based analyses have shown that there is a wealth of diversity among this phylum, particularly in marine datasets. The prevalence of these viruses in metatranscriptomes points toward their ecological importance in nutrient turnover in our oceans and as such, in-depth study into non-amoebal Nucleocytoviricota should be considered a focal point in viral ecology. In this review, we report on Kratosvirus quantuckense (née Aureococcus anophagefferens Virus), an algae-infecting virus of the Imitervirales. Current systems for study in the Nucleocytoviricota differ significantly from this virus and its relatives, and a litany of trade-offs within physiology, coding potential, and ecology compared to these other viruses reveal the importance of K. quantuckense. Herein, we review the research that has been performed on this virus as well as its potential as a model system for algal-virus interactions.
-
Abstract Viruses of the phylum Nucleocytoviricota are ubiquitous in ocean waters and play important roles in shaping the dynamics of marine ecosystems. In this study, we leveraged the bioGEOTRACES metagenomic dataset collected across the Atlantic and Pacific Oceans to investigate the biogeography of these viruses in marine environments. We identified 330 viral genomes, including 212 in the order Imitervirales and 54 in the order Algavirales. We found that most viruses appeared to be prevalent in shallow waters (<150 m), and that viruses of the Mesomimiviridae (Imitervirales) and Prasinoviridae (Algavirales) are by far the most abundant and diverse groups in our survey. Five mesomimiviruses and one prasinovirus are particularly widespread in oligotrophic waters; annotation of these genomes revealed common stress response systems, photosynthesis-associated genes, and oxidative stress modulation genes that may be key to their broad distribution in the pelagic ocean. We identified a latitudinal pattern in viral diversity in one cruise that traversed the North and South Atlantic Ocean, with viral diversity peaking at high latitudes of the northern hemisphere. Community analyses revealed three distinct Nucleocytoviricota communities across latitudes, categorized by latitudinal distance towards the equator. Our results contribute to the understanding of the biogeography of these viruses in marine systems.
-
Although viruses are traditionally viewed as streamlined and simple, discoveries over the last century have revealed that they can exhibit surprisingly complex physical structures, genomic organization, ecological interactions, and evolutionary histories. Viruses can have physical dimensions and genome lengths that exceed those of many cellular lineages, and their infection strategies can involve a remarkable level of physiological remodeling of their host cells. Virus–virus communication and widespread forms of hyperparasitism have been shown to be common in the virosphere, demonstrating that dynamic ecological interactions often shape their success. And the evolutionary histories of viruses are often fraught with complexities, with chimeric genomes including genes derived from numerous distinct sources or evolved de novo. Here we will discuss many aspects of this viral complexity, with particular emphasis on large DNA viruses, and provide an outlook for future research.
-
Abstract Recent research has underscored the immense diversity and key biogeochemical roles of large DNA viruses in the ocean. Although they are important constituents of marine ecosystems, it is sometimes difficult to detect these viruses due to their large size and complex genomes. This is true for “jumbo” bacteriophages, which have genome sizes >200 kbp and large capsids reaching up to 0.45 µm in diameter. In this study, we sought to assess the genomic diversity and distribution of these bacteriophages in the ocean by generating and analyzing jumbo phage genomes from metagenomes. We recover 85 marine jumbo phages that range in size from 201 to 498 kilobases, and we examine their genetic similarities and biogeography together with a reference database of marine jumbo phage genomes. By analyzing Tara Oceans metagenomic data, we show that although most jumbo phages can be detected in a range of different size fractions, 17 of our bins tend to be found in those greater than 0.22 µm, potentially due to their large size. Our network-based analysis of gene-sharing patterns reveals that jumbo bacteriophages belong to five genome clusters that are typified by diverse replication strategies, genomic repertoires, and potential host ranges. Our analysis of jumbo phage distributions in the ocean reveals that depth is a major factor shaping their biogeography, with some phage genome clusters occurring preferentially in either surface or mesopelagic waters. Taken together, our findings indicate that jumbo phages are widespread community members in the ocean with complex genomic repertoires and ecological impacts that warrant further targeted investigation.
-
Casadesús, Josep (Ed.)
The evolutionary forces that determine genome size in bacteria and archaea have been the subject of intense debate over the last few decades. Although the preferential loss of genes observed in prokaryotes is explained through the deletional bias, factors promoting and preventing the fixation of such gene losses often remain unclear. Importantly, statistical analyses on this topic typically do not consider the potential bias introduced by the shared ancestry of many lineages, which is critical when using species as data points because of the potential dependence on residuals. In this study, we investigated the genome size distributions across a broad diversity of bacteria and archaea to evaluate if this trait is phylogenetically conserved at broad phylogenetic scales. After model fit, Pagel’s lambda indicated a strong phylogenetic signal in genome size data, suggesting that the diversification of this trait is influenced by shared evolutionary histories. We used a phylogenetic generalized least-squares analysis (PGLS) to test whether phylogeny influences the predictability of genome size from dN/dS ratios and 16S copy number, two variables that have been previously linked to genome size. These results confirm that failure to account for evolutionary history can lead to biased interpretations of genome size predictors. Overall, our results indicate that although bacteria and archaea can rapidly gain and lose genetic material through gene transfers and deletions, respectively, phylogenetic signal for genome size distributions can still be recovered at broad phylogenetic scales that should be taken into account when inferring the drivers of genome size evolution.
-
The gut of the European honey bee (Apis mellifera) possesses a relatively simple bacterial community, but little is known about its community of prophages (temperate bacteriophages integrated into the bacterial genome). Although prophages may eventually begin replicating and kill their bacterial hosts, they can also sometimes be beneficial for their hosts by conferring protection from other phage infections or encoding genes in metabolic pathways and for toxins. In this study, we explored prophages in 17 species of core bacteria in the honey bee gut and two honey bee pathogens. Out of the 181 genomes examined, 431 putative prophage regions were predicted. Among core gut bacteria, the number of prophages per genome ranged from zero to seven and prophage composition (the compositional percentage of each bacterial genome attributable to prophages) ranged from 0 to 7%. Snodgrassella alvi and Gilliamella apicola had the highest median prophages per genome (3.0 ± 1.46; 3.0 ± 1.59), as well as the highest prophage composition (2.58% ± 1.4; 3.0% ± 1.59). The pathogen Paenibacillus larvae had a higher median number of prophages (8.0 ± 5.33) and prophage composition (6.40% ± 3.08) than the pathogen Melissococcus plutonius or any of the core bacteria. Prophage populations were highly specific to their bacterial host species, suggesting most prophages were acquired recently relative to the divergence of these bacterial groups. Furthermore, functional annotation of the predicted genes encoded within the prophage regions indicates that some prophages in the honey bee gut encode additional benefits to their bacterial hosts, such as genes in carbohydrate metabolism. Collectively, this survey suggests that prophages within the honey bee gut may contribute to the maintenance and stability of the honey bee gut microbiome and potentially modulate specific members of the bacterial community, particularly S. alvi and G. apicola.
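The prophage composition metric defined above (the percentage of a bacterial genome attributable to prophages) can be illustrated with a short calculation. This is a hedged sketch rather than the study's own method: the function name, the half-open coordinate convention, and the choice to merge overlapping predictions before summing are assumptions, and the example coordinates are invented.

```python
def prophage_composition(genome_length, regions):
    """Percent of a genome covered by predicted prophage regions.

    genome_length -- genome size in base pairs
    regions       -- iterable of half-open (start, end) coordinates
                     (hypothetical prediction output)
    Overlapping regions are merged so shared bases count only once.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(regions):
        if current_end is None or start > current_end:
            # Close the previous merged region before starting a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length


# Invented example: a 1 Mbp genome with three predicted regions,
# two of which overlap.
example_percent = prophage_composition(
    1_000_000,
    [(100_000, 120_000), (110_000, 130_000), (500_000, 510_000)],
)
```

Merging first matters when a prediction tool reports overlapping regions; summing raw region lengths would then overstate the percentage.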
